7 research outputs found

    DEEP LEARNING IN PERSONALIZED MEDICINE: APPLICATIONS IN PATIENT SIMILARITY, PROGNOSIS, AND OPTIMAL TREATMENT SELECTION

    Two information technology revolutions are colliding in medicine. The first revolution has been the digitalization of health data, specifically Electronic Health Records (EHRs). These records contain the details of who we are as patients, our ailments, treatments, and outcomes. Tragically, despite billions of dollars in investment from the US government, hardly any of this data is being utilized to better understand medicine or improve healthcare. This is largely because the data is voluminous, sparse, complex, and poorly formatted, making it unsuitable for traditional analytics methods. However, the second revolution, modern Artificial Intelligence, specifically deep learning, provides tools, in the form of algorithms, to address exactly these problems. The primary difference between these modern algorithms and older ones is that the former are able to learn, more or less on their own, how to transform large complex data into a format that makes it easier to use and learn from. In this dissertation, I have developed methods to apply deep learning to digital health data. In doing so, I have shown that we can predict the future health of individual patients with highly complex diseases, produced approaches to understand and leverage what these complex models are learning, and provided a framework for how healthcare systems of the near future could automatically learn to improve care daily. For the first time in history, we are in a position to learn from the combined knowledge of tens of thousands of physicians and their experiences caring for hundreds of millions of patients. The potential transformations to healthcare are difficult to fully fathom, but certainly include safer, more powerful and efficient medicine, and a rapid speedup in new medical discoveries and treatments. Despite the promise, we must proceed carefully, balancing the great need to collectively use our data for better medicine with the individual right to privacy.

    Developing the Total Health Profile, a Generalizable Unified Set of Multimorbidity Risk Scores Derived From Machine Learning for Broad Patient Populations: Retrospective Cohort Study

    Background: Multimorbidity clinical risk scores allow clinicians to quickly assess their patients' health for decision making, often for recommendation to care management programs. However, these scores are limited by several issues: existing multimorbidity scores (1) are generally limited to one data group (eg, diagnoses, labs) and may be missing vital information, (2) are usually limited to specific demographic groups (eg, age), and (3) do not formally provide any granularity in the form of more nuanced multimorbidity risk scores to direct clinician attention. Objective: Using diagnosis, lab, prescription, procedure, and demographic data from electronic health records (EHRs), we developed a physiologically diverse and generalizable set of multimorbidity risk scores. Methods: Using EHR data from a nationwide cohort of patients, we developed the total health profile, a set of six integrated risk scores reflecting five distinct organ systems and overall health. We selected the occurrence of an inpatient hospital visitation over a 2-year follow-up window, attributable to specific organ systems, as our risk endpoint. Using a physician-curated set of features, we trained six machine learning models on 794,294 patients to predict the calibrated probability of the aforementioned endpoint, producing risk scores for heart, lung, neuro, kidney, and digestive functions and a sixth score for combined risk. We evaluated the scores using a held-out test cohort of 198,574 patients. Results: Study patients closely matched national census averages, with a median age of 41 years, a median income of $66,829, and racial averages by zip code of 73.8% White, 5.9% Asian, and 11.9% African American. All models were well calibrated and demonstrated strong performance with areas under the receiver operating characteristic curve (AUROCs) of 0.83 for the total health score (THS), 0.89 for heart, 0.86 for lung, 0.84 for neuro, 0.90 for kidney, and 0.83 for digestive functions. There was consistent performance of this scoring system across sexes, diverse patient ages, and zip code income levels. Each model learned to generate predictions by focusing on appropriate clinically relevant patient features, such as heart-related hospitalizations and chronic hypertension diagnosis for the heart model. The THS outperformed the other commonly used multimorbidity scoring systems, specifically the Charlson Comorbidity Index (CCI) and the Elixhauser Comorbidity Index (ECI), overall (AUROCs: THS=0.823, CCI=0.735, ECI=0.649) as well as for every age, sex, and income bracket. Performance improvements were most pronounced for middle-aged and lower-income subgroups. Ablation tests using only diagnosis, prescription, social determinants of health, and lab feature groups, while retaining procedure-related features, showed that the combination of feature groups has the best predictive performance, though only marginally better than the diagnosis-only model on at-risk groups. Conclusions: Massive retrospective EHR data sets have made it possible to use machine learning to build practical multimorbidity risk scores that are highly predictive, personalizable, intuitive to explain, and generalizable across diverse patient populations.
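For readers unfamiliar with the AUROC figures quoted above, the following is a minimal sketch of how such a value can be computed from binary outcomes and predicted risk scores, using the rank-sum (Mann-Whitney U) identity. The function name `auroc` and the toy inputs are illustrative assumptions; this is not the study's evaluation code, and the abstract does not say which implementation the authors used.

```python
def auroc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: iterable of 0/1 outcomes (1 = event occurred).
    scores: iterable of predicted risks, higher meaning more likely.
    Hypothetical helper for illustration only.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)

    # Assign 1-based ranks, giving tied scores their average rank.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative")

    # U statistic for the positive class, normalised to [0, 1].
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


if __name__ == "__main__":
    # Toy data, not from the study.
    print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The value can be read as the probability that a randomly chosen patient who had the event receives a higher score than a randomly chosen patient who did not, with ties counted as one half.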

    Table_1_Victims of human trafficking and exploitation in the healthcare system: a retrospective study using a large multi-state dataset and ICD-10 codes.XLSX

    Trafficking and exploitation for sex or labor affect millions of persons worldwide. To improve healthcare for these patients, in late 2018 new ICD-10 medical diagnosis codes were implemented in the US. These 13 codes include diagnosis of adult and child sexual exploitation, adult and child labor exploitation, and history of exploitation. Here we report on a database search of a large US health insurer that contained approximately 47.1 million patients and 0.9 million provider organizations, not limited to large medical systems. We reported on any diagnosis with the new codes between 2018-09-01 and 2022-09-01. The dataset was found to contain 5,262 instances of the ICD-10 codes. Regression analysis of the codes found a 5.8% increase in the uptake of these codes per year, representing a decline relative to the 6.7% annual increase in the data. The codes were used by 1,810 different providers (0.19% of total) for 2,793 patients. Of the patients, 1,248 were recently trafficked, while the remainder had a personal history of exploitation. Of the recent cases, 86% experienced sexual exploitation, 14% labor exploitation, and 0.8% both types. These patients were predominantly female (83%) with a median age of 20 (interquartile range: 15–35). The patients were characterized by persistently high prevalence of mental health conditions (including anxiety: 21%, post-traumatic stress disorder: 20%, major depression: 18%), sexually transmitted infections, and high utilization of the emergency department (ED). The patients’ first report of trafficking occurred most often outside of a hospital or emergency setting (55%), primarily during office and psychiatric visits.